Applying Artificial Immune System for Outlier Detection: A Comparative Study
Abstract
Outlier detection is a data mining method for discovering exceptional, abnormal, or suspiciously unusual samples in a data set. Outliers typically exemplify the "data-rich but information-poor" dilemma. Data mining methods are applied to this problem in a broad range of application fields, such as credit card fraud detection, network intrusion detection, error extraction, clinical disease research, and sports statistics. Besides classical distance-based outlier detection techniques, nature-inspired evolutionary approaches to outlier detection also exist. However, apart from a few limited application fields, artificial immune systems have not been applied to the fundamental outlier detection problem. In this study, we use the Artificial Immune System Algorithm to solve the outlier detection problem. We compare the outlier detection performance of the Artificial Immune System with that of the K-Nearest Neighbor Algorithm, the Distance Based Outlier Detection Algorithm, and the Box Plot method, using one artificial and two real-life datasets. Comparing the results, we found that the Artificial Immune System Algorithm gives better results and works with a lower error rate than the distance-based methods.
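For readers unfamiliar with the three baseline methods named in the abstract, the following is a minimal sketch of how each is commonly formulated. It is not the implementation used in the study: the function names, the 1.5 whisker multiplier, the choice of `k`, and the `radius` and `min_fraction` parameters are all assumptions made for illustration, and the Artificial Immune System itself is not sketched because the abstract gives no details of it.

```python
import math
import statistics


def boxplot_outliers(values, whisker=1.5):
    """Box plot rule: flag values outside [Q1 - w*IQR, Q3 + w*IQR].

    The multiplier w = 1.5 is the conventional default, assumed here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < low or v > high]


def knn_scores(points, k=2):
    """k-nearest-neighbor score: distance from each point to its
    k-th nearest neighbor. Larger scores suggest outliers."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def distance_based_outliers(points, radius, min_fraction=0.9):
    """Distance-based rule (assumed formulation): flag a point when at
    least min_fraction of the other points lie farther than radius."""
    flagged = []
    for i, p in enumerate(points):
        others = [q for j, q in enumerate(points) if j != i]
        far = sum(1 for q in others if math.dist(p, q) > radius)
        if far / len(others) >= min_fraction:
            flagged.append(p)
    return flagged


# Hypothetical toy data, not taken from the study.
values = [10, 11, 12, 12, 13, 13, 14, 15, 40]
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]

print(boxplot_outliers(values))
print(knn_scores(points))
print(distance_based_outliers(points, radius=5.0))
```

All three rules depend on a threshold (the whisker multiplier, a cutoff on the k-NN score, or the radius and fraction), so in practice the reported error rates of such baselines vary with how those thresholds are chosen.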
Similar resources
Outlier Detection in Wireless Sensor Networks Using Distributed Principal Component Analysis
Detecting anomalies is an important challenge for intrusion detection and fault diagnosis in wireless sensor networks (WSNs). To address the problem of outlier detection in wireless sensor networks, in this paper we present a PCA-based centralized approach and a DPCA-based distributed energy-efficient approach for detecting outliers in sensed data in a WSN. The outliers in sensed data can be ca...
A Comparative Study of RNN for Outlier Detection in Data Mining
We have proposed replicator neural networks (RNNs) for outlier detection [8]. Here we compare RNN for outlier detection with three other methods using both publicly available statistical datasets (generally small) and data mining datasets (generally much larger and generally real data). The smaller datasets provide insights into the relative strengths and weaknesses of RNNs. The larger datasets...
Outlier Detection by Boosting Regression Trees
A procedure for detecting outliers in regression problems is proposed. It is based on information provided by boosting regression trees. The key idea is to select the most frequently resampled observation along the boosting iterations and reiterate after removing it. The selection criterion is based on Tchebychev’s inequality applied to the maximum over the boosting iterations of ...
Outlier Detection Using Extreme Learning Machines Based on Quantum Fuzzy C-Means
One of the most important concerns of a data miner is always to have accurate and error-free data, that is, data that contains no human errors and whose records are complete and correct. In this paper, a new learning model based on an extreme learning machine neural network is proposed for outlier detection. The function of neural networks depends on various parameters such as the structur...
Credit Card Fraud Detection using Data mining and Statistical Methods
Due to today’s advances in technology and business, fraud detection has become a critical component of financial transactions. Given the vast amounts of data in large datasets, it becomes more difficult to detect fraudulent transactions manually. In this research, we propose a combined method using both data mining and statistical tasks, utilizing feature selection, resampling and cost-...